A (2/3)n^3 Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain
Abstract
This paper presents a new fast-pivoting algorithm that computes the n Gittins index values of an n-state bandit, in both the discounted and undiscounted cases, by performing (2/3)n^3 + O(n^2) arithmetic operations, thus attaining better complexity than previous algorithms and matching that of solving a corresponding linear-equation system by Gaussian elimination. The algorithm further applies to the problem of optimal stopping of a Markov chain, for which a novel Gittins-index solution approach is introduced. The algorithm draws on Gittins and Jones’ (1974) index definition via calibration, on Kallenberg’s (1986) proposal of using parametric linear programming, on Dantzig’s simplex method, on the Varaiya et al. (1985) algorithm, and on the author’s earlier work. This paper elucidates the structure of parametric simplex tableaux. Special structure is exploited to reduce the computational effort of pivot steps, decreasing the operation count by a factor of three relative to conventional pivoting, and by a factor of 3/2 relative to recent state-elimination algorithms. A computational study demonstrates significant time savings against alternative algorithms.
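To make the object being computed concrete, the following is a minimal Python sketch of a naive "largest remaining index" scheme in the spirit of the Varaiya et al. (1985) approach cited in the abstract. It is not the paper's fast-pivoting algorithm: it re-solves a full linear system by Gaussian elimination at every step, so it costs on the order of n^4 operations rather than (2/3)n^3. The function names `solve` and `gittins_indices`, the reward-rate normalization of the index (discounted reward divided by discounted time), and the restriction to a discount factor strictly below 1 are assumptions made for illustration.

```python
from typing import List


def solve(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """Solve a @ x = b (b may have several columns) by Gaussian elimination
    with partial pivoting. Hypothetical helper for this sketch."""
    n = len(a)
    # Build the augmented matrix [a | b] on copies of the input rows.
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    width = len(m[0])
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, width):
                m[r][c] -= f * m[col][c]
    # Back substitution, one right-hand-side column at a time.
    k_cols = width - n
    x = [[0.0] * k_cols for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for k in range(k_cols):
            s = m[r][n + k] - sum(m[r][c] * x[c][k] for c in range(r + 1, n))
            x[r][k] = s / m[r][r]
    return x


def gittins_indices(P: List[List[float]], R: List[float], beta: float) -> List[float]:
    """Naive Gittins index computation for a discounted n-state bandit.

    P is the transition matrix, R the per-state rewards, 0 <= beta < 1.
    States are ranked from highest to lowest index; 'cont' holds the
    states ranked so far, which act as the continuation set.
    """
    n = len(R)
    index = [0.0] * n
    cont = set()
    while len(cont) < n:
        # A = I - beta * P_S, where P_S keeps only the columns of P in 'cont'.
        A = [[(1.0 if i == j else 0.0) - (beta * P[i][j] if j in cont else 0.0)
              for j in range(n)] for i in range(n)]
        # Column 0: expected discounted reward; column 1: expected discounted time,
        # both for "play once, then keep playing while inside 'cont'".
        sol = solve(A, [[R[i], 1.0] for i in range(n)])
        # The unranked state with the largest reward/time ratio is next in rank.
        best = max((i for i in range(n) if i not in cont),
                   key=lambda i: sol[i][0] / sol[i][1])
        index[best] = sol[best][0] / sol[best][1]
        cont.add(best)
    return index
```

As a small check, for the two-state chain `P = [[0, 1], [1, 0]]` with rewards `R = [1, 0]` and `beta = 0.5`, state 0 has index 1 and state 1 has index (0.5 * 1) / (1 + 0.5) = 1/3, which is what `gittins_indices` returns up to rounding.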
Related articles
A Generalized Gittins Index for a Markov Chain and its Recursive Calculation
We discuss a generalization of the classical Gittins Index for a Markov chain and propose a transparent recursive algorithm for its calculation. The foundation for this algorithm is a modified version of the Elimination algorithm proposed earlier by the author to solve the problem of optimal stopping of a Markov chain in discrete time and a finite or countable state space.
Optimal Stopping of Markov Chain and Three Abstract Optimization Problems
There is a well-known connection between three problems related to optimal stopping of a Markov chain and the equality of three corresponding indices: the classical Gittins index in the Ratio Maximization Problem, the Katehakis-Veinott index in a Restart Problem, and the Whittle index in a family of Retirement Problems. In [13] these three problems and these three indices were generalized in such a w...
One-armed bandit models with continuous and delayed responses
One-armed bandit processes with continuous delayed responses are formulated as controlled stochastic processes following the Bayesian approach. It is shown that under some regularity conditions, a Gittins-like index exists which is the limit of a monotonic sequence of break-even values characterizing optimal initial selections of arms for finite horizon bandit processes. Furthermore, there is a...
Robust Control of the Multi-armed Bandit Problem
We study a robust model of the multi-armed bandit (MAB) problem in which the transition probabilities are ambiguous and belong to subsets of the probability simplex. We first show that for each arm there exists a robust counterpart of the Gittins index that is the solution to a robust optimal stopping-time problem. We then characterize the optimal policy of the robust MAB as a project-by-projec...
Continue, quit, restart probability model
We discuss a new applied probability model: there is a system whose evolution is described by a Markov chain (MC) with a known transition matrix on a discrete state space, and at each moment of discrete time a decision maker can apply one of three possible actions: continue, quit, or restart the MC in one of a finite number of fixed “restarting” points. Such a model is a generalization of a model d...
Journal: INFORMS Journal on Computing
Volume 19
Publication date: 2007